2.8 Numpy中的快速排序：np.sort 和 np.argsort

本文源码请见我的GitHub

2.8.1 Numpy中的快速排序：np.sort 和 np.argsort

1	import numpy as np

1 2	x = np.array([2,1,5,4,3])#不修改原始数组的基础上返回一个排好的数组 np.sort(x)

array([1, 2, 3, 4, 5])

1 2	x.sort() print(x)#用排好的数组代替原始数组

[1 2 3 4 5]

#返回原始数组排好的索引值：
x = np.array([2,1,4,3,5])
i = np.argsort(x)
print(i)

[1 0 3 2 4]

沿着行或列排序

1
2
3

rand =np.random.RandomState(42)
X =np.random.randint(0,10,(4,6))
print(X)

[[6 1 9 3 3 7]
 [9 7 4 9 4 5]
 [2 9 6 3 3 1]
 [2 7 1 0 6 3]]

1 2	#对X的每一列排序 np.sort(X,axis=0) #从大到小

array([[2, 1, 1, 0, 3, 1],
       [2, 7, 4, 3, 3, 3],
       [6, 7, 6, 3, 4, 5],
       [9, 9, 9, 9, 6, 7]])

1 2	#对行排序 np.sort(X,axis=1)

array([[1, 3, 3, 6, 7, 9],
       [4, 4, 5, 7, 9, 9],
       [1, 2, 3, 3, 6, 9],
       [0, 1, 2, 3, 6, 7]])

1	'''上述处理将行或者列当作独立的数组，任何行或列的值之间的关系将会丢失'''

'上述处理将行或者列当作独立的数组，任何行或列的值之间的关系将会丢失'

2.8.2 部分排序：分隔

1
2
3

#有时候不需要排序，二十找到数组中第K小的值：np.partition
x = np.array([7,2,3,1,6,5,4])
np.partition(x,3)#第几小的数 这里是第三小 输出结果是新数组，最左边是第k小的值，往右是任意顺序的其他值

array([2, 1, 3, 4, 6, 5, 7])

1 2	#也可以沿着多维数组任意的轴进行分隔 np.partition(X,2,axis=1)

array([[1, 3, 3, 6, 9, 7],
       [4, 4, 5, 9, 7, 9],
       [1, 2, 3, 6, 3, 9],
       [0, 1, 2, 7, 6, 3]])

2.8.3 demo:K个最近邻

1	import numpy as np

X = rand.rand(10,2)
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn;seaborn.set()#设置绘画风格
plt.scatter(X[:,0],X[:,1],s = 100);

1	dist_sq = np.sum((X[:,np.newaxis,:] - X[np.newaxis,:,:]) ** 2 , axis= -1)

1 2	difference = X[:, np.newaxis, :] - X[np.newaxis, : , :] difference.shape

(10, 10, 2)

1 2	sq_differences = difference ** 2 sq_differences.shape

(10, 10, 2)

1 2	dist_sq = sq_differences.sum(-1) dist_sq.shape

(10, 10)

1	dist_sq.diagonal()

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

1 2	nearest = np.argsort(dist_sq, axis=1) print(nearest)

[[0 3 4 5 8 1 9 7 2 6]
 [1 4 6 9 8 0 7 3 2 5]
 [2 7 9 8 6 4 3 1 0 5]
 [3 5 0 8 4 9 7 2 1 6]
 [4 1 0 8 9 6 3 5 7 2]
 [5 3 0 8 4 9 1 7 2 6]
 [6 1 9 4 8 7 2 0 3 5]
 [7 2 9 8 6 4 1 3 0 5]
 [8 9 4 7 2 3 0 1 5 6]
 [9 8 7 2 6 1 4 0 3 5]]

1 2	K = 2 nearest_partition = np.argpartition(dist_sq, K + 1, axis= 1)

1	plt .scatter(X[:,0],X[:,1], s= 100)

<matplotlib.collections.PathCollection at 0x2a2fa444be0>

#将每个点与他最近的两个点相连
for i in range(X.shape[0]):
    for j in nearest_partition[i, :K+1]:
        plt.plot(*zip(X[j], X[i]), color = "gray")